ABSTRACT



Chapters in this report outline the potential plans for the
redesign of the National Assessment of Educational Progress (NAEP). It is
argued that any successful redesign must consider the NAEP as a whole. This
report reviews overall NAEP designs and discusses the implications that each
of the designs has for various functional areas. The following chapters are
included: (1) "Introduction" (Eugene G. Johnson and Stephen Lazer); (2) "An
Integrated Approach to the Redesign of NAEP" (Eugene G. Johnson); (3)
"Potential Designs for NAEP" (Eugene G. Johnson); (4) "Measuring Cognitive
Skills" (Stephen Lazer, Robert J. Mislevy, Kim R. Whittington, and William
Ward); (5) "Measuring Contextual Information" (Gita Z. Wilder); (6)
"Sampling" (Keith Foster Rust and Juliet Popper Shaffer); (7) "Data
Collection" (Nancy W. Caldwell); (8) "Scoring" (Christine Y. O'Sullivan); (9)
"Analysis" (Eugene G. Johnson and James E. Carlson); and (10) "Reporting"
(Stephen Lazer and Eugene G. Johnson). An appendix contains the policy
statement on redesigning the NAEP from the National Assessment Governing
Board and "An Operational Vision for NAEP--Year 2000 and Beyond" from the
National Center for Education Statistics. (SLD)
Chapter 1
Introduction 

Executive Summary 



This chapter discusses the general purpose of this report, which is to outline the 
potential plans for the redesign of the National Assessment of Educational Progress 
(NAEP). We view NAEP as an integrated system, where a change in any one of the 
functional areas (cognitive measurement, contextual questionnaire development,
sampling, data collection, scoring, analysis, and reporting) will have an impact on the
others. Thus, we argue that any successful redesign effort must consider NAEP as a 
whole. Our report considers overall NAEP designs and discusses the implications that 
each of these designs has for the various functional areas.
CHAPTER 1: INTRODUCTION



- Eugene G. Johnson / Stephen Lazer - 

For 27 years the National Assessment of Educational Progress (NAEP) has 
served as the nation's primary indicator of what students know and can do. Based on 
state-of-the-art measurement techniques, integrated use of cognitive and background 
questions, and representative national samples, NAEP has served as the country's best 
provider of reliable, objective information on student performances and on trends in 
academic achievement. NAEP data and reports are currently used in a variety of arenas 
and have informed the various debates about educational reform in the United States. 

Over the three decades of its existence, the National Assessment has become one 
of the most innovative and successful surveys regularly conducted in the United States. 
NAEP has been asked to meet a wide variety of goals and priorities, and these have 
imposed constraints and demands faced by no other educational assessment program. 
The National Assessment has been called on to measure student knowledge of broad 
content domains and to gather in-depth contextual information, at the same time 
minimizing the burden faced by individual participants. NAEP has pioneered the use 
of performance assessment methodologies in large-scale settings, and NAEP staff have 
determined ways to use computerized image-processing technologies to score 
performance exercises in a cost-effective and statistically reliable manner. 
Psychometricians working on NAEP have developed procedures that allow for the 
combination of multiple-choice and performance measures into integrated scales. 
National Assessment analysts, programmers, and authors have developed artificial 
intelligence systems that generate computer-written natural-language reports for states 
that participate in NAEP. Overall, NAEP has become a gold standard: a model of 
innovation and accuracy. 

However, it is perhaps NAEP's very successes that have created some of the 
strains that have led to the current redesign initiative. Because NAEP has shown a 
consistent ability to satisfy program goals, new priorities have arisen, priorities that
have often been in conflict with other program imperatives. In the late 1980s and early 
1990s, NAEP was simultaneously asked to increase its use of performance assessment 
exercises and to test larger numbers of students as parts of state samples. New 
definitions of assessment content defined in National Assessment Governing Board 
(NAGB) Frameworks necessitated assessments involving both multiple-choice and 
constructed-response questions and the combination of these item-types in core 
reporting scales. NAEP was called on to measure trends and to reflect the best and most 
up-to-date curricular practices. NAEP was asked to provide timely information for 
policymakers and to allow in-depth analyses by education researchers in various 
subject disciplines. The publication of America 2000: An Education Strategy and the 
related work of the National Education Goals Panel increased the relevance of NAEP 
data and led to demands for more timely and frequent reporting; these demands came 
precisely at the time that the National Assessment was becoming more complex and 
expensive to administer. These developments and imperatives tended to interact and 
intensify: NAEP's increasing visibility and proven record of success led policymakers
and educators to view the National Assessment as a vehicle of curricular reform and to 
demand even greater innovation in instrument design. 

Overall, NAEP was called on to do more in an era of level funding. 
Concomitantly, the program's new priorities in no way excused it from its historical 
imperatives of providing the American public with statistical data of the highest 
quality, minimizing individual respondent burden, protecting participant 
confidentiality, responding rapidly to changes in policy, and allowing Department of 
Education policymakers maximum flexibility in their decision-making processes. 

Despite the many and varied challenges that NAEP has faced, the program has 
continued to meet the majority of its goals. NAEP instruments, sampling designs, 
administration procedures, and psychometric methodologies have become models of 
innovation, yet have remained operationally and analytically feasible. NAEP reports 
have served the needs of a wide variety of audiences. The program's expansion to the 
state level has made NAEP the benchmark against which the success of educational
reform efforts is measured and new programs are planned. NAEP's management and
implementation have fostered a flexibility that has allowed time that was once needed 
for test development to be spent, instead, on building consensus about what is to be 
measured and how it is to be measured. In addition, this flexibility has enabled 
assessments to evolve within the context of maintaining trend data. And NAEP's 
matrix-sampled design has allowed content, rather than testing methodology, to be the 
driving factor in the construction of NAEP instruments. 

However, these successes have come at a price: trade-offs have had to be
made. With flexibility in schedule and instrument design has come analytic complexity. 
With performance testing have come further complications of analysis, difficulty in 
trend determination and, especially in the state program, significant expense. 
Evolutionary changes in assessments have required special bridging studies whose 
analyses must fall on the critical work path. New assessment Frameworks have 
invariably posed new developmental and psychometric challenges. Together, these 
changes in the assessment have prevented NAEP from realizing the efficiencies 
associated with the operational consistencies of most testing programs. All these factors 
have tended to add both complexity and cost to NAEP. With complexity has come 
lengthy reporting schedules. With expense has come limits on the number of 
assessments that can be administered. 

The realization that trade-offs are inherent in the design and conduct of the 
NAEP program has led many associated with NAEP to begin asking fundamental 
questions about the program's future directions. For example, do assessments that 
necessitate extensive use of performance testing, while feasible if administered to small 
national samples, prove prohibitively expensive in a state-level program? Can a 
National Assessment that is expensive to administer and score serve the state linking 
function which many now envision for it? Is it possible that re-crafting the current 
integrated NAEP structure in favor of a modular design (in which certain inexpensive
instruments represent an assessment core while other, more innovative modules could
be given as needed and analyzed off the critical reporting path) might better serve the
program's new missions? In general, can one instrument satisfy all the publics who 
may wish to use its results? These and other questions began to suggest to many that if 
the basic purposes and structures of NAEP were not reexamined, the program might 
not be able to fully meet all of its conflicting imperatives. 

The NAGB/NCES Redesign Initiative 

Realizing that NAEP faced questions about its basic mission and design, NAGB 
began a careful and thorough examination of the nature and purposes of the National 
Assessment and the ways it could be redesigned to better meet its goals. This process 
involved input from Board members and staff, an independent Design/Feasibility 
Team¹ made up of eminent psychometricians, and hundreds of concerned citizens. The
result of this initiative was the NAGB adoption, on August 2, 1996, of the Policy 
Statement on Redesigning The National Assessment of Educational Progress. This statement
argues that NAEP should have three core objectives that would serve as the means for 
accomplishing its legislatively-mandated purpose of providing a fair and accurate 
presentation of educational achievement. These objectives are: 

(1) to measure national and state progress toward the third National Education 
Goal² and provide timely, fair, and accurate data about student achievement
at the national level, among the states, and in comparison with other nations 

(2) to develop, through a national consensus, sound assessments to measure 
what students know and can do, as well as what students should know and 
be able to do 



¹ The results of this team's work, published as Design/Feasibility Team: Report to the National Assessment Governing Board, had an
important influence on the NAGB Redesign Statement and have also played a major role in organizing the work in this redesign
planning effort.

² The third National Education Goal, called "Student Achievement and Citizenship," states that: "By the year 2000 all students will
leave grades 4, 8, and 12 having demonstrated competency over challenging subject matter including English, mathematics, science,
foreign languages, civics and government, economics, arts, history, and geography, and every school in America will ensure that all
students learn to use their minds well, so they may be prepared for responsible citizenship, further learning, and productive
employment in our nation's modern economy." The National Education Goals Report: Building a Nation of Learners. National Education
Goals Panel. (1996). Washington, DC.